AITopics | with-replacement sampling and convexity

New Insight into Hybrid Stochastic Gradient Descent: Beyond With-Replacement Sampling and Convexity

Neural Information Processing SystemsNov-20-2025, 22:22:00 GMT

As an incremental-gradient algorithm, the hybrid stochastic gradient descent (HSGD) enjoys merits of both stochastic and full gradient methods for finite-sum minimization problem. However, the existing rate-of-convergence analysis for HSGD is made under with-replacement sampling (WRS) and is restricted to convex problems. It is not clear whether HSGD still carries these advantages under the common practice of without-replacement sampling (WoRS) for non-convex problems. In this paper, we affirmatively answer this open question by showing that under WoRS and for both convex and non-convex problems, it is still possible for HSGD (with constant step-size) to match full gradient descent in rate of convergence, while maintaining comparable sample-size-independent incremental first-order oracle complexity to stochastic gradient descent. For a special class of finite-sum problems with linear prediction models, our convergence results can be further improved in some cases. Extensive numerical results confirm our theoretical affirmation and demonstrate the favorable efficiency of WoRS-based HSGD.

hybrid stochastic gradient descent, with-replacement sampling, with-replacement sampling and convexity, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Reviews: New Insight into Hybrid Stochastic Gradient Descent: Beyond With-Replacement Sampling and Convexity

Neural Information Processing SystemsOct-7-2024, 11:48:03 GMT

This paper lies in the well-developed field of variance-reduced stochastic gradient algorithms. It proposes a theoretical study of the well-known HSGD for sampling without replacement scheme, which is known to perform better than sampling with replacement in practice. The considered setting makes the analysis of the algorithm more difficult, since in this case the mini-batch gradients are not unbiased anymore. Rates for strongly-convex, non-strongly convex and non-convex objectives are provided, with a special care for linearly structured problems (such as generalized linear models). Numerical experiments are convincing enough, although the datasets used in experiments are not very large (largest one is covtype) (while authors argue that this methods is interesting in the large n medium precision setting).

artificial intelligence, machine learning, with-replacement sampling and convexity, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

New Insight into Hybrid Stochastic Gradient Descent: Beyond With-Replacement Sampling and Convexity

Zhou, Pan, Yuan, Xiaotong, Feng, Jiashi

Neural Information Processing SystemsFeb-14-2020, 07:43:19 GMT

As an incremental-gradient algorithm, the hybrid stochastic gradient descent (HSGD) enjoys merits of both stochastic and full gradient methods for finite-sum minimization problem. However, the existing rate-of-convergence analysis for HSGD is made under with-replacement sampling (WRS) and is restricted to convex problems. It is not clear whether HSGD still carries these advantages under the common practice of without-replacement sampling (WoRS) for non-convex problems. In this paper, we affirmatively answer this open question by showing that under WoRS and for both convex and non-convex problems, it is still possible for HSGD (with constant step-size) to match full gradient descent in rate of convergence, while maintaining comparable sample-size-independent incremental first-order oracle complexity to stochastic gradient descent. For a special class of finite-sum problems with linear prediction models, our convergence results can be further improved in some cases. Extensive numerical results confirm our theoretical affirmation and demonstrate the favorable efficiency of WoRS-based HSGD.

hybrid stochastic gradient descent, new insight, with-replacement sampling and convexity, (2 more...)

Neural Information Processing Systems

Genre: Research Report (0.44)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Filters

Collaborating Authors

with-replacement sampling and convexity

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

New Insight into Hybrid Stochastic Gradient Descent: Beyond With-Replacement Sampling and Convexity

Reviews: New Insight into Hybrid Stochastic Gradient Descent: Beyond With-Replacement Sampling and Convexity

New Insight into Hybrid Stochastic Gradient Descent: Beyond With-Replacement Sampling and Convexity